On using MLP features in LVCSR

Authors

  • Qifeng Zhu
  • Barry Y. Chen
  • Nelson Morgan
  • Andreas Stolcke
Abstract

One of the major research thrusts in the speech group at ICSI is the use of Multi-Layer Perceptron (MLP) based features in automatic speech recognition (ASR). This paper presents a study of three aspects of this effort: 1) the properties of the MLP features that make them useful, 2) incorporating MLP features together with PLP features in ASR, and 3) possible redundancy between MLP features and more conventional system refinements such as discriminative training and system combination. The paper shows that MLP transformations yield variables that have regular distributions, which can be further modified by taking the logarithm to make the distribution easier to model with a Gaussian-HMM. Two or more vectors of these features can easily be combined without increasing the feature dimension. Recognition results show that MLP features can significantly improve recognition performance in large vocabulary continuous speech recognition (LVCSR) tasks for the NIST 2001 Hub-5 evaluation set with models trained on the Switchboard Corpus, even when discriminative training and system combination are used.
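
The feature pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy activations, the probability floor, and the plain averaging rule used to combine posterior vectors are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows that taking the logarithm of MLP posteriors, merging two posterior vectors, and appending the result to a PLP frame are simple per-frame operations, and that the merge leaves the MLP feature dimension unchanged.

```python
import math


def softmax(activations):
    """Turn raw MLP output activations into posterior probabilities."""
    peak = max(activations)  # subtract the max for numerical stability
    exps = [math.exp(a - peak) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


def log_posteriors(posteriors, floor=1e-10):
    """Take the log of each posterior; the floor (an assumption) avoids log(0)."""
    return [math.log(max(p, floor)) for p in posteriors]


def average_posteriors(streams):
    """Combine posterior vectors by plain averaging (assumed combination rule).

    The output has the same dimension as each input vector.
    """
    dim = len(streams[0])
    if any(len(s) != dim for s in streams):
        raise ValueError("all posterior vectors must have the same dimension")
    return [sum(s[i] for s in streams) / len(streams) for i in range(dim)]


def augment_plp(plp_frame, mlp_frame):
    """Append MLP-derived features to a PLP feature frame."""
    return list(plp_frame) + list(mlp_frame)


# Toy example: two hypothetical MLPs, three output classes, one frame.
post_a = softmax([2.0, 0.5, -1.0])
post_b = softmax([1.5, 1.0, -0.5])
combined = average_posteriors([post_a, post_b])  # still three values
mlp_features = log_posteriors(combined)
plp = [0.1, -0.3, 0.7, 0.2]                      # made-up PLP values
frame = augment_plp(plp, mlp_features)           # 4 PLP + 3 MLP values
```

In a real system the log posteriors would typically be decorrelated and reduced in dimension before being appended to the PLP features; that step is left out here to keep the sketch short.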

Similar resources

A Study on Speaker Normalized MLP Features in LVCSR

Different normalization methods are applied in recent Large Vocabulary Continuous Speech Recognition Systems (LVCSR) to reduce the influence of speaker variability on the acoustic models. In this paper we investigate the use of Vocal Tract Length Normalization (VTLN) and Speaker Adaptive Training (SAT) in Multi Layer Perceptron (MLP) feature extraction on an English task. We achieve significant...

Context-Dependent MLPs for LVCSR: TANDEM, Hybrid or Both?

Gaussian Mixture Model (GMM) and Multi Layer Perceptron (MLP) based acoustic models are compared on a French large vocabulary continuous speech recognition (LVCSR) task. In addition to optimizing the output layer size of the MLP, the effect of the deep neural network structure is also investigated. Moreover, using different linear transformations (time derivatives, LDA, CMLLR) on conventional M...

(Deep) Neural Networks

This work continues the development of the recently proposed Bottle-Neck features for ASR. A five-layer MLP used in bottleneck feature extraction makes it possible to obtain an arbitrary feature size without dimensionality reduction by transforms, independently of the MLP training targets. The MLP topology – number and sizes of layers, suitable training targets, the impact of output feature transforms, the ne...

Analysis and Comparison of Recent MLP Features for LVCSR Systems

MLP based front-ends have evolved in different ways in recent years beyond the seminal TANDEM-PLP features. This paper aims to provide a fair comparison of these recent advances, including the use of different long/short temporal inputs (PLP, MRASTA, wLP-TRAPS, DCT-TRAPS) and the use of complex architectures (bottleneck, hierarchy, multistream) that go beyond the conventional three-layer MLP. F...

Data-driven clustered hierarchical tandem system for LVCSR

In tandem systems, the outputs of multi-layer perceptron (MLP) classifiers have been successfully used as features for HMM-based automatic speech recognition. In this paper, we propose a data-driven clustered hierarchical tandem system that yields improved performance on a large-vocabulary broadcast news transcription task. The complicated global learning for a large monolithic MLP classifier i...

Publication date: 2004